{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "3.3 数据取值与选择  \n",
    "第2章具体介绍了获取、设置、调整NumPy数组数值的方法与工具，包括取值操作（如arr[2, 1]）、切片操作（如arr[:, 1:5]）、掩码操作（如arr[arr > 0]）、花哨的索引操作（如arr[0, [1, 5]]），以及组合操作（如arr[:, [1, 5]]）。下面介绍Pandas的Series和DataFrame对象相似的数据获取与调整操作  \n",
    "Series数据选择方法  \n",
    "1. 将Series看作字典  \n",
    "2. 将Series看作一维数组  \n",
    "3. 索引器：loc、iloc和ix  "
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "source": [
    "import pandas as pd\n",
    "data = pd.Series([0.25,0.5,0.75,1.0], index= ['a', 'b', 'c', 'd'])\n",
    "data"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "a    0.25\n",
       "b    0.50\n",
       "c    0.75\n",
       "d    1.00\n",
       "dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 1
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "source": [
    "data['b']"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "0.5"
      ]
     },
     "metadata": {},
     "execution_count": 2
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "source": [
    "'a' in data"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "metadata": {},
     "execution_count": 3
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "source": [
    "data.keys()"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "Index(['a', 'b', 'c', 'd'], dtype='object')"
      ]
     },
     "metadata": {},
     "execution_count": 4
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "source": [
    "list(data.items())"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "[('a', 0.25), ('b', 0.5), ('c', 0.75), ('d', 1.0)]"
      ]
     },
     "metadata": {},
     "execution_count": 5
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "Series对象还可以用字典语法调整数据。就像你可以通过增加新的键扩展字典一样，你也可以通过增加新的索引值扩展Series："
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "source": [
    "data['e'] = 1.25\n",
    "data"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "a    0.25\n",
       "b    0.50\n",
       "c    0.75\n",
       "d    1.00\n",
       "e    1.25\n",
       "dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 6
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "Series对象的可变性是一个非常方便的特性：Pandas在底层已经为可能发生的内存布局和数据复制自动决策，用户不需要担心这些问题"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "source": [
    "data['a':'c']"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "a    0.25\n",
       "b    0.50\n",
       "c    0.75\n",
       "dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 7
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "source": [
    "data[0:2]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "a    0.25\n",
       "b    0.50\n",
       "dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 8
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "source": [
    "data[(data>0.3)&(data<0.8)]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "b    0.50\n",
       "c    0.75\n",
       "dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 9
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "source": [
    "data[['a', 'e']]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "a    0.25\n",
       "e    1.25\n",
       "dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 10
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "source": [
    "data = pd.Series(['a', 'b', 'c'], index=[1,3,5])\n",
    "data"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "1    a\n",
       "3    b\n",
       "5    c\n",
       "dtype: object"
      ]
     },
     "metadata": {},
     "execution_count": 11
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "source": [
    "data[1]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "'a'"
      ]
     },
     "metadata": {},
     "execution_count": 12
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "source": [
    "data[1:3]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "3    b\n",
       "5    c\n",
       "dtype: object"
      ]
     },
     "metadata": {},
     "execution_count": 13
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "source": [
    "data.loc[1]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "'a'"
      ]
     },
     "metadata": {},
     "execution_count": 14
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "source": [
    "data.loc[1:3]  #第一种索引器是loc属性，表示取值和切片都是显式的："
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "1    a\n",
       "3    b\n",
       "dtype: object"
      ]
     },
     "metadata": {},
     "execution_count": 15
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "source": [
    "data.iloc[1]  #第二种是iloc属性，表示取值和切片都是Python形式的1隐式索引：\n",
    "data.iloc[1:3]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "3    b\n",
       "5    c\n",
       "dtype: object"
      ]
     },
     "metadata": {},
     "execution_count": 17
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "第三种取值属性是ix，它是前两种索引器的混合形式，在Series对象中ix等价于标准的[]（Python列表）取值方式。ix索引器主要用于DataFrame对象，后面将会介绍。Python代码的设计原则之一是“显式优于隐式”。使用loc和iloc可以让代码更容易维护，可读性更高。"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "DataFrame数据选择方法 \n",
    " 前面曾提到，DataFrame在有些方面像二维或结构化数组，在有些方面又像一个共享索引的若干Series对象构成的字典。这两种类比可以帮助我们更好地掌握这种数据结构的数据选择方法。  \n",
    " 1. 将DataFrame看作字典  \n",
    " 2. 将DataFrame看作二维数组  \n",
    "3. 其他取值方法"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "source": [
    "area = pd.Series({'California': 423967, 'Texas': 695662,               \n",
    "            'New York': 141297, 'Florida': 170312,      \n",
    "                                 'Illinois': 149995})         \n",
    "pop = pd.Series({'California': 38332521, 'Texas': 26448193,      \n",
    "                    'New York': 19651127, 'Florida': 19552860,        \n",
    "                                      'Illinois': 12882135})         \n",
    "data = pd.DataFrame({'area':area, 'pop':pop})         \n",
    "data"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>area</th>\n",
       "      <th>pop</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>California</th>\n",
       "      <td>423967</td>\n",
       "      <td>38332521</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Texas</th>\n",
       "      <td>695662</td>\n",
       "      <td>26448193</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>New York</th>\n",
       "      <td>141297</td>\n",
       "      <td>19651127</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Florida</th>\n",
       "      <td>170312</td>\n",
       "      <td>19552860</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Illinois</th>\n",
       "      <td>149995</td>\n",
       "      <td>12882135</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              area       pop\n",
       "California  423967  38332521\n",
       "Texas       695662  26448193\n",
       "New York    141297  19651127\n",
       "Florida     170312  19552860\n",
       "Illinois    149995  12882135"
      ]
     },
     "metadata": {},
     "execution_count": 18
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "两个Series分别构成DataFrame的一列，可以通过对列名进行字典形式（dictionary-style）的取值获取数据"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "source": [
    "data['area']"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "California    423967\n",
       "Texas         695662\n",
       "New York      141297\n",
       "Florida       170312\n",
       "Illinois      149995\n",
       "Name: area, dtype: int64"
      ]
     },
     "metadata": {},
     "execution_count": 19
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "同样，也可以用属性形式（attribute-style）选择纯字符串列名的数据："
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "source": [
    "data.area"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "California    423967\n",
       "Texas         695662\n",
       "New York      141297\n",
       "Florida       170312\n",
       "Illinois      149995\n",
       "Name: area, dtype: int64"
      ]
     },
     "metadata": {},
     "execution_count": 20
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "如果列名不是纯字符串，或者列名与DataFrame的方法同名，那么就不能用属性索引。  \n",
    "另外，还应该避免对用属性形式选择的列直接赋值（即可以用data['pop'] = z，但不要用data.pop = z）。  \n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "source": [
    "data['density'] = data['pop'] / data['area']\n",
    "data"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>area</th>\n",
       "      <th>pop</th>\n",
       "      <th>density</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>California</th>\n",
       "      <td>423967</td>\n",
       "      <td>38332521</td>\n",
       "      <td>90.413926</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Texas</th>\n",
       "      <td>695662</td>\n",
       "      <td>26448193</td>\n",
       "      <td>38.018740</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>New York</th>\n",
       "      <td>141297</td>\n",
       "      <td>19651127</td>\n",
       "      <td>139.076746</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Florida</th>\n",
       "      <td>170312</td>\n",
       "      <td>19552860</td>\n",
       "      <td>114.806121</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Illinois</th>\n",
       "      <td>149995</td>\n",
       "      <td>12882135</td>\n",
       "      <td>85.883763</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              area       pop     density\n",
       "California  423967  38332521   90.413926\n",
       "Texas       695662  26448193   38.018740\n",
       "New York    141297  19651127  139.076746\n",
       "Florida     170312  19552860  114.806121\n",
       "Illinois    149995  12882135   85.883763"
      ]
     },
     "metadata": {},
     "execution_count": 22
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "source": [
    "data.values"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "array([[4.23967000e+05, 3.83325210e+07, 9.04139261e+01],\n",
       "       [6.95662000e+05, 2.64481930e+07, 3.80187404e+01],\n",
       "       [1.41297000e+05, 1.96511270e+07, 1.39076746e+02],\n",
       "       [1.70312000e+05, 1.95528600e+07, 1.14806121e+02],\n",
       "       [1.49995000e+05, 1.28821350e+07, 8.58837628e+01]])"
      ]
     },
     "metadata": {},
     "execution_count": 23
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "source": [
    "data.T"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>California</th>\n",
       "      <th>Texas</th>\n",
       "      <th>New York</th>\n",
       "      <th>Florida</th>\n",
       "      <th>Illinois</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>area</th>\n",
       "      <td>4.239670e+05</td>\n",
       "      <td>6.956620e+05</td>\n",
       "      <td>1.412970e+05</td>\n",
       "      <td>1.703120e+05</td>\n",
       "      <td>1.499950e+05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>pop</th>\n",
       "      <td>3.833252e+07</td>\n",
       "      <td>2.644819e+07</td>\n",
       "      <td>1.965113e+07</td>\n",
       "      <td>1.955286e+07</td>\n",
       "      <td>1.288214e+07</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>density</th>\n",
       "      <td>9.041393e+01</td>\n",
       "      <td>3.801874e+01</td>\n",
       "      <td>1.390767e+02</td>\n",
       "      <td>1.148061e+02</td>\n",
       "      <td>8.588376e+01</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           California         Texas      New York       Florida      Illinois\n",
       "area     4.239670e+05  6.956620e+05  1.412970e+05  1.703120e+05  1.499950e+05\n",
       "pop      3.833252e+07  2.644819e+07  1.965113e+07  1.955286e+07  1.288214e+07\n",
       "density  9.041393e+01  3.801874e+01  1.390767e+02  1.148061e+02  8.588376e+01"
      ]
     },
     "metadata": {},
     "execution_count": 24
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "source": [
    "data.values[0]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "array([4.23967000e+05, 3.83325210e+07, 9.04139261e+01])"
      ]
     },
     "metadata": {},
     "execution_count": 25
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "source": [
    "#而获取一列数据就需要向DataFrame传递单个列索引\n",
    "data['area']"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "California    423967\n",
       "Texas         695662\n",
       "New York      141297\n",
       "Florida       170312\n",
       "Illinois      149995\n",
       "Name: area, dtype: int64"
      ]
     },
     "metadata": {},
     "execution_count": 26
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "因此，在进行数组形式的取值时，我们就需要用另一种方法——前面介绍过的Pandas索引器loc、iloc和ix了。通过iloc索引器，我们就可以像对待NumPy数组一样索引Pandas的底层数组（Python的隐式索引），DataFrame的行列标签会自动保留在结果中："
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "source": [
    "data.iloc[:3, :2]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>area</th>\n",
       "      <th>pop</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>California</th>\n",
       "      <td>423967</td>\n",
       "      <td>38332521</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Texas</th>\n",
       "      <td>695662</td>\n",
       "      <td>26448193</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>New York</th>\n",
       "      <td>141297</td>\n",
       "      <td>19651127</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              area       pop\n",
       "California  423967  38332521\n",
       "Texas       695662  26448193\n",
       "New York    141297  19651127"
      ]
     },
     "metadata": {},
     "execution_count": 27
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "source": [
    "data.loc[:'Illinois', :'pop']"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>area</th>\n",
       "      <th>pop</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>California</th>\n",
       "      <td>423967</td>\n",
       "      <td>38332521</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Texas</th>\n",
       "      <td>695662</td>\n",
       "      <td>26448193</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>New York</th>\n",
       "      <td>141297</td>\n",
       "      <td>19651127</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Florida</th>\n",
       "      <td>170312</td>\n",
       "      <td>19552860</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Illinois</th>\n",
       "      <td>149995</td>\n",
       "      <td>12882135</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              area       pop\n",
       "California  423967  38332521\n",
       "Texas       695662  26448193\n",
       "New York    141297  19651127\n",
       "Florida     170312  19552860\n",
       "Illinois    149995  12882135"
      ]
     },
     "metadata": {},
     "execution_count": 28
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "source": [
    "data.ix[:3, :'pop']  #使用ix索引器可以实现一种混合效果  1.0.0之后的版本已经没有这个方法了"
   ],
   "outputs": [
    {
     "output_type": "error",
     "ename": "AttributeError",
     "evalue": "'DataFrame' object has no attribute 'ix'",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mAttributeError\u001b[0m                            Traceback (most recent call last)",
      "\u001b[0;32m/tmp/ipykernel_14094/1486857625.py\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mix\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m:\u001b[0m\u001b[0;34m'pop'\u001b[0m\u001b[0;34m]\u001b[0m  \u001b[0;31m#使用ix索引器可以实现一种混合效果\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;32m~/ubuntu_data/software/miniconda3/envs/d2l-zh/lib/python3.8/site-packages/pandas/core/generic.py\u001b[0m in \u001b[0;36m__getattr__\u001b[0;34m(self, name)\u001b[0m\n\u001b[1;32m   5476\u001b[0m         ):\n\u001b[1;32m   5477\u001b[0m             \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 5478\u001b[0;31m         \u001b[0;32mreturn\u001b[0m \u001b[0mobject\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__getattribute__\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mname\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m   5479\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   5480\u001b[0m     \u001b[0;32mdef\u001b[0m \u001b[0m__setattr__\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mname\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvalue\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mAttributeError\u001b[0m: 'DataFrame' object has no attribute 'ix'"
     ]
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "source": [
    "#任何用于处理NumPy形式数据的方法都可以用于这些索引器。例如，可以在loc索引器中结合使用掩码与花哨的索引方法：\n",
    "data.loc[data.density>100, ['pop', 'density']]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>pop</th>\n",
       "      <th>density</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>New York</th>\n",
       "      <td>19651127</td>\n",
       "      <td>139.076746</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Florida</th>\n",
       "      <td>19552860</td>\n",
       "      <td>114.806121</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "               pop     density\n",
       "New York  19651127  139.076746\n",
       "Florida   19552860  114.806121"
      ]
     },
     "metadata": {},
     "execution_count": 30
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "source": [
    "data.iloc[0,2] = 90\n",
    "data"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>area</th>\n",
       "      <th>pop</th>\n",
       "      <th>density</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>California</th>\n",
       "      <td>423967</td>\n",
       "      <td>38332521</td>\n",
       "      <td>90.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Texas</th>\n",
       "      <td>695662</td>\n",
       "      <td>26448193</td>\n",
       "      <td>38.018740</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>New York</th>\n",
       "      <td>141297</td>\n",
       "      <td>19651127</td>\n",
       "      <td>139.076746</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Florida</th>\n",
       "      <td>170312</td>\n",
       "      <td>19552860</td>\n",
       "      <td>114.806121</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Illinois</th>\n",
       "      <td>149995</td>\n",
       "      <td>12882135</td>\n",
       "      <td>85.883763</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              area       pop     density\n",
       "California  423967  38332521   90.000000\n",
       "Texas       695662  26448193   38.018740\n",
       "New York    141297  19651127  139.076746\n",
       "Florida     170312  19552860  114.806121\n",
       "Illinois    149995  12882135   85.883763"
      ]
     },
     "metadata": {},
     "execution_count": 32
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "source": [
    "data['Florida':'Illinois']"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>area</th>\n",
       "      <th>pop</th>\n",
       "      <th>density</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Florida</th>\n",
       "      <td>170312</td>\n",
       "      <td>19552860</td>\n",
       "      <td>114.806121</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Illinois</th>\n",
       "      <td>149995</td>\n",
       "      <td>12882135</td>\n",
       "      <td>85.883763</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            area       pop     density\n",
       "Florida   170312  19552860  114.806121\n",
       "Illinois  149995  12882135   85.883763"
      ]
     },
     "metadata": {},
     "execution_count": 33
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "source": [
    "data[1:3]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>area</th>\n",
       "      <th>pop</th>\n",
       "      <th>density</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Texas</th>\n",
       "      <td>695662</td>\n",
       "      <td>26448193</td>\n",
       "      <td>38.018740</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>New York</th>\n",
       "      <td>141297</td>\n",
       "      <td>19651127</td>\n",
       "      <td>139.076746</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            area       pop     density\n",
       "Texas     695662  26448193   38.018740\n",
       "New York  141297  19651127  139.076746"
      ]
     },
     "metadata": {},
     "execution_count": 34
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "source": [
    "#与之类似，掩码操作也可以直接对每一行进行过滤，而不需要使用loc索引器\n",
    "data[data.density>100]"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>area</th>\n",
       "      <th>pop</th>\n",
       "      <th>density</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>New York</th>\n",
       "      <td>141297</td>\n",
       "      <td>19651127</td>\n",
       "      <td>139.076746</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Florida</th>\n",
       "      <td>170312</td>\n",
       "      <td>19552860</td>\n",
       "      <td>114.806121</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            area       pop     density\n",
       "New York  141297  19651127  139.076746\n",
       "Florida   170312  19552860  114.806121"
      ]
     },
     "metadata": {},
     "execution_count": 35
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [],
   "outputs": [],
   "metadata": {}
  }
 ],
 "metadata": {
  "orig_nbformat": 4,
  "language_info": {
   "name": "python",
   "version": "3.8.10",
   "mimetype": "text/x-python",
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "pygments_lexer": "ipython3",
   "nbconvert_exporter": "python",
   "file_extension": ".py"
  },
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3.8.10 64-bit ('d2l-zh': conda)"
  },
  "interpreter": {
   "hash": "0bedf3452ed48e38a2b75ff9f66ce8ae9ae2e92e8665cb618ee0797d2f46b9e5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}